Introduction to Open Data Science - Course Project

About the project

Initial feelings, expectations

I am looking forward to the course for two reasons:

  • would like to get a structured introduction to how R is used for data science, as I have learnt this previously on my own and with much help from a friend of mine
  • hopefully learn some new features too

Another initial impression of the course to me that the moodle page is rather verbose and chaotic and this hampers finding the necessary information and the assigned tasks. I hope that I will get used to this and will not miss any compulsary assignment.

I have found the course in Sisu because I still needed one credit to my Transferable skills section.

Learning experience on the R for Health Data Science

## [1] "I've found the book interesting and well written. Naturally due to my previous experience in R, I was only skim-reading it. Nevertheless I have found some useful functionalities that I haven't used before. e.g."
  1. the usage of logical operators in the filter function filter(year == 2020 | disease == COVID19)
  2. I tend to forget how to dodge columns but I was reminded and produced some nice plots
gapdata2007 %>% 
  ggplot(aes(x = continent, y = lifeExp, fill = country)) +
  geom_col(position="dodge") + theme(legend.position = "none") 

  1. I was not aware of the percent() function in tidyverse, it’s much simpler than sprintf(%2.1f%%).
  2. It was nice to learn that with plotly and html output of Rmarkdown one can create interactive htmls (as one can hover over the diagram below to highlight countries).
plot = gapdata2007 %>% 
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent, label = country)) +
  geom_point()
  
ggplotly(plot)

Insert chapter 2 title here

Describe the work you have done this week and summarize your learning.

date()
## [1] "Sun Nov 06 12:02:25 2022"

Here we go again…


(more chapters to be added similarly as we proceed with the course!)